Goto

Collaborating Authors

 arima model


Galerkin-ARIMA: A Two-Stage Polynomial Regression Framework for Fast Rolling One-Step-Ahead Forecasting

Liu, Haojie, Lin, Zihan

arXiv.org Machine Learning

We introduce Galerkin-ARIMA, a novel time-series forecasting framework that integrates Galerkin projection techniques with the classical ARIMA model to capture potentially nonlinear dependencies in lagged observations. By replacing the fixed linear autoregressive component with a spline-based basis expansion, Galerkin-ARIMA flexibly approximates the underlying relationship among past values via ordinary least squares, while retaining the moving-average structure and Gaussian innovation assumptions of ARIMA. We derive closed-form solutions for both the AR and MA components using two-stage Galerkin projections, establish conditions for asymptotic unbiasedness and consistency, and analyze the bias-variance trade-off under basis-size growth. Complexity analysis reveals that, for moderate basis dimensions, our approach can substantially reduce computational cost compared to maximum-likelihood ARIMA estimation. Through extensive simulations on four synthetic processes-including noisy ARMA, seasonal, trend-AR, and nonlinear recursion series-we demonstrate that Galerkin-ARIMA matches or closely approximates ARIMA's forecasting accuracy while achieving orders-of-magnitude speedups in rolling forecasting tasks. These results suggest that Galerkin-ARIMA offers a powerful, efficient alternative for modeling complex time series dynamics in high-volume or real-time applications.


Load Forecasting in the Era of Smart Grids: Opportunities and Advanced Machine Learning Models

Maneshni, Aurausp

arXiv.org Artificial Intelligence

Electric energy is difficult to store, requiring stricter control over its generation, transmission, and distribution. A persistent challenge in power systems is maintaining real-time equilibrium between electricity demand and supply. Oversupply contributes to resource wastage, while undersupply can strain the grid, increase operational costs, and potentially impact service reliability. To maintain grid stability, load forecasting is needed. Accurate load forecasting balances generation and demand by striving to predict future electricity consumption. This thesis examines and evaluates four machine learning frameworks for short term load forecasting, including gradient boosting decision tree methods such as Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM). A hybrid framework is also developed. In addition, two recurrent neural network architectures, Long Short Term Memory (LSTM) networks and Gated Recurrent Units (GRU), are designed and implemented. Pearson Correlation Coefficient is applied to assess the relationships between electricity demand and exogenous variables. The experimental results show that, for the specific dataset and forecasting task in this study, machine learning-based models achieved improved forecasting performance compared to a classical ARIMA baseline.


Multi-Modal Traffic Analysis: Integrating Time-Series Forecasting, Accident Prediction, and Image Classification

M, Nivedita, S, Yasmeen Shajitha

arXiv.org Artificial Intelligence

--This paper presents a comprehensive framework that integrates multiple machine learning techniques for advanced traffic analysis. Our approach combines (1) an ARIMA(2,0,1) model for time-series forecasting, achieving a Mean Absolute Error (MAE) of 2.1; (2) an XGBoost classifier for accident severity prediction with 100% accuracy on balanced datasets; and (3) a Convolutional Neural Network (CNN) architecture for traffic image classification, achieving 92% accuracy. These methods were rigorously tested on heterogeneous datasets, demonstrating significant improvements over baseline models. Feature importance analysis revealed key contributing factors, such as weather conditions and road infrastructure, to accident severity. This research lays the groundwork for future advancements in intelligent transportation systems. Urban traffic management is a critical challenge in modern cities, where growing populations and increasing vehicle densities exacerbate congestion and safety issues.


Exploring Patterns Behind Sports

Liu, Chang, Ma, Chengcheng, Zhou, XuanQi

arXiv.org Artificial Intelligence

This paper presents a comprehensive framework for time series prediction using a hybrid model that combines ARIMA and LSTM. The model incorporates feature engineering techniques, including embedding and PCA, to transform raw data into a lower-dimensional representation while retaining key information. The embedding technique is used to convert categorical data into continuous vectors, facilitating the capture of complex relationships. PCA is applied to reduce dimensionality and extract principal components, enhancing model performance and computational efficiency. To handle both linear and nonlinear patterns in the data, the ARIMA model captures linear trends, while the LSTM model models complex nonlinear dependencies. The hybrid model is trained on historical data and achieves high accuracy, as demonstrated by low RMSE and MAE scores. Additionally, the paper employs the run test to assess the randomness of sequences, providing insights into the underlying patterns. Ablation studies are conducted to validate the roles of different components in the model, demonstrating the significance of each module. The paper also utilizes the SHAP method to quantify the impact of traditional advantages on the predicted results, offering a detailed understanding of feature importance. The KNN method is used to determine the optimal prediction interval, further enhancing the model's accuracy. The results highlight the effectiveness of combining traditional statistical methods with modern deep learning techniques for robust time series forecasting in Sports.


Forecasting S&P 500 Using LSTM Models

Pilla, Prashant, Mekonen, Raji

arXiv.org Artificial Intelligence

With the volatile and complex nature of financial data influenced by external factors, forecasting the stock market is challenging. Traditional models such as ARIMA and GARCH perform well with linear data but struggle with non-linear dependencies. Machine learning and deep learning models, particularly Long Short-Term Memory (LSTM) networks, address these challenges by capturing intricate patterns and long-term dependencies. This report compares ARIMA and LSTM models in predicting the S&P 500 index, a major financial benchmark. Using historical price data and technical indicators, we evaluated these models using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The ARIMA model showed reasonable performance with an MAE of 462.1, RMSE of 614, and 89.8 percent accuracy, effectively capturing short-term trends but limited by its linear assumptions. The LSTM model, leveraging sequential processing capabilities, outperformed ARIMA with an MAE of 369.32, RMSE of 412.84, and 92.46 percent accuracy, capturing both short- and long-term dependencies. Notably, the LSTM model without additional features performed best, achieving an MAE of 175.9, RMSE of 207.34, and 96.41 percent accuracy, showcasing its ability to handle market data efficiently. Accurately predicting stock movements is crucial for investment strategies, risk assessments, and market stability. Our findings confirm the potential of deep learning models in handling volatile financial data compared to traditional ones. The results highlight the effectiveness of LSTM and suggest avenues for further improvements. This study provides insights into financial forecasting, offering a comparative analysis of ARIMA and LSTM while outlining their strengths and limitations.


Assessing and Predicting Air Pollution in Asia: A Regional and Temporal Study (2018-2023)

Rahman, Anika, Khatun, Mst. Taskia

arXiv.org Artificial Intelligence

This study analyzes and predicts air pollution in Asia, focusing on PM 2.5 levels from 2018 to 2023 across five regions: Central, East, South, Southeast, and West Asia. South Asia emerged as the most polluted region, with Bangladesh, India, and Pakistan consistently having the highest PM 2.5 levels and death rates, especially in Nepal, Pakistan, and India. East Asia showed the lowest pollution levels. K-means clustering categorized countries into high, moderate, and low pollution groups. The ARIMA model effectively predicted 2023 PM 2.5 levels (MAE: 3.99, MSE: 33.80, RMSE: 5.81, R: 0.86). The findings emphasize the need for targeted interventions to address severe pollution and health risks in South Asia.


FlowScope: Enhancing Decision Making by Time Series Forecasting based on Prediction Optimization using HybridFlow Forecast Framework

Boyeena, Nitin Sagar, Kumar, Begari Susheel

arXiv.org Artificial Intelligence

Time series forecasting is crucial in several sectors, such as meteorology, retail, healthcare, and finance. Accurately forecasting future trends and patterns is crucial for strategic planning and making well-informed decisions. In this case, it is crucial to include many forecasting methodologies. The strengths of Auto-regressive Integrated Moving Average (ARIMA) for linear time series, Seasonal ARIMA models (SARIMA) for seasonal time series, Exponential Smoothing State Space Models (ETS) for handling errors and trends, and Long Short-Term Memory (LSTM) Neural Network model for complex pattern recognition have been combined to create a comprehensive framework called FlowScope. SARIMA excels in capturing seasonal variations, whereas ARIMA ensures effective handling of linear time series. ETS models excel in capturing trends and correcting errors, whereas LSTM networks excel in reflecting intricate temporal connections. By combining these methods from both machine learning and deep learning, we propose a deep-hybrid learning approach FlowScope which offers a versatile and robust platform for predicting time series data. This empowers enterprises to make informed decisions and optimize long-term strategies for maximum performance. Keywords: Time Series Forecasting, HybridFlow Forecast Framework, Deep-Hybrid Learning, Informed Decisions.


Navigating Inflation in Ghana: How Can Machine Learning Enhance Economic Stability and Growth Strategies

Baidoo, Theophilus G., Obeng, Ashley

arXiv.org Artificial Intelligence

Inflation remains a persistent challenge for many African countries. This research investigates the critical role of machine learning (ML) in understanding and managing inflation in Ghana, emphasizing its significance for the country's economic stability and growth. Utilizing a comprehensive dataset spanning from 2010 to 2022, the study aims to employ advanced ML models, particularly those adept in time series forecasting, to predict future inflation trends. The methodology is designed to provide accurate and reliable inflation forecasts, offering valuable insights for policymakers and advocating for a shift towards data-driven approaches in economic decision-making. This study aims to significantly advance the academic field of economic analysis by applying machine learning (ML) and offering practical guidance for integrating advanced technological tools into economic governance, ultimately demonstrating ML's potential to enhance Ghana's economic resilience and support sustainable development through effective inflation management.


An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting

Cao, Rui, Wang, Qiao

arXiv.org Artificial Intelligence

This research examines the use of Large Language Models (LLMs) in predicting time series, with a specific focus on the LLMTIME model. Despite the established effectiveness of LLMs in tasks such as text generation, language translation, and sentiment analysis, this study highlights the key challenges that large language models encounter in the context of time series prediction. We assess the performance of LLMTIME across multiple datasets and introduce classical almost periodic functions as time series to gauge its effectiveness. The empirical results indicate that while large language models can perform well in zero-shot forecasting for certain datasets, their predictive accuracy diminishes notably when confronted with diverse time series data and traditional signals. The primary finding of this study is that the predictive capacity of LLMTIME, similar to other LLMs, significantly deteriorates when dealing with time series data that contain both periodic and trend components, as well as when the signal comprises complex frequency components.


Optimization of Energy Consumption Forecasting in Puno using Parallel Computing and ARIMA Models: An Innovative Approach to Big Data Processing

Vilca-Tinta, Cliver W., Torres-Cruz, Fred, Quispe-Morales, Josefh J.

arXiv.org Machine Learning

This research presents an innovative use of parallel computing with the ARIMA (AutoRegressive Integrated Moving Average) model to forecast energy consumption in Peru's Puno region. The study conducts a thorough and multifaceted analysis, focusing on the execution speed, prediction accuracy, and scalability of both sequential and parallel implementations. A significant emphasis is placed on efficiently managing large datasets. The findings demonstrate notable improvements in computational efficiency and data processing capabilities through the parallel approach, all while maintaining the accuracy and integrity of predictions. This new method provides a versatile and reliable solution for real-time predictive analysis and enhances energy resource management, which is particularly crucial for developing areas. In addition to highlighting the technical advantages of parallel computing in this field, the study explores its practical impacts on energy planning and sustainable development in regions like Puno.